Advanced Filtering Options | Contains, Matches and Extract Fields
Accomplishing in-depth packet analysis sometimes ends up with a special filtering requirement that cannot be covered with default filters. TShark supports Wireshark's "contains" and "matches" operators, which are the key to the advanced filtering options. You can visit the Wireshark: Packet Operations room (Task 6) if you are unfamiliar with these filters.
A quick recap from the Wireshark: Packet Operations room:
| Filter | Details |
|---|---|
| Contains | - Search a value inside packets. - Case sensitive. - Similar to Wireshark's "find" option. |
| Matches | - Search a pattern inside packets. - Supports regex. - Case insensitive. - Complex queries have a margin of error. |
Note: The "contains" and "matches" operators cannot be used with fields consisting of "integer" values.
Tip: Using HEX and regex values instead of ASCII always has a better chance of a match.
Extract Fields
This option helps analysts to extract specific parts of data from the packets. In this way, analysts have the opportunity to collect and correlate various fields from the packets. It also helps analysts manage the query output on the terminal. The query structure is explained in the table given below.
| Main Filter | Target Field | Show Field Name |
|---|---|---|
| -T fields | -e |
-E header=y |
Note: You need to use the -e parameter for each field you want to display.
You can filter any field by using the field names as shown below.
-T fields -e ip.src -e ip.dst -E header=y
Filter: "contains"
| Filter | contains |
|---|---|
| Type | Comparison operator |
| Description | Search a value inside packets. It is case-sensitive and provides similar functionality to the "Find" option by focusing on a specific field. |
| Example | Find all "Apache" servers. |
| Workflow | List all HTTP packets where the "server" field contains the "Apache" keyword. |
| Usage | http.server contains "Apache" |
E.g
Contains filter
user@ubuntu$ tshark -r demo.pcapng -Y 'http.server contains "Apache"'
38 4.846969 65.208.228.223 ? 145.254.160.237 HTTP/XML HTTP/1.1 200 OK
user@ubuntu$ tshark -r demo.pcapng -Y 'http.server contains "Apache"' -T fields -e ip.src -e ip.dst -e http.server -E header=y
ip.src ip.dst http.server
65.208.228.223 145.254.160.237 Apache
Filter: "matches"
| Filter | matches |
|---|---|
| Type | Comparison operator |
| Description | Search a pattern of a regular expression. It is case-insensitive, and complex queries have a margin of error. |
| Example | Find all .php and .html pages. |
| Workflow | List all HTTP packets where the "request method" field matches the keywords "GET" or "POST". |
| Usage | http.request.method matches "(GET|POST)" |
E.g
Matches filter
user@ubuntu$ tshark -r demo.pcapng -Y 'http.request.method matches "(GET|POST)"'
4 0.911310 145.254.160.237 ? 65.208.228.223 HTTP GET /download.html HTTP/1.1
18 2.984291 145.254.160.237 ? 216.239.59.99 HTTP GET /pagead/ads?client=ca-pub-2309191948673629&random=1084443430285&
user@ubuntu$ tshark -r demo.pcapng -Y 'http.request.method matches "(GET|POST)"' -T fields -e ip.src -e ip.dst -e http.request.method -E header=y
ip.src ip.dst http.request.method
145.254.160.237 65.208.228.223 GET
145.254.160.237 216.239.59.99 GET
Use Cases
When investigating a case, a security analyst should know how to extract hostnames, DNS queries, and user agents to hunt low-hanging fruits after viewing the statistics and creating an investigation plan. The most common four use cases for every security analyst are demonstrated below. If you want to learn more about the mentioned protocols and benefits of the extracted info, please refer to the Wireshark Traffic Analysis room.
Extract Hostnames
user@ubuntu$ tshark -r hostnames.pcapng -T fields -e dhcp.option.hostname
92-rkd
92-rkd
T3400
T3400
60-alfb-sec2
60-alfb-sec2
aminott
...
The above example shows how to extract hostnames from DHCP packets with TShark. However, the output is hard to manage when multiple duplicate values exist. A skilled analyst should know how to use native Linux tools/utilities to manage and organise the command line output, as shown below.
Extract hostnames
user@ubuntu$ tshark -r hostnames.pcapng -T fields -e dhcp.option.hostname | awk NF | sort -r | uniq -c | sort -r
26 202-ac
18 92-rkd
14 93-sts-sec
...
Now the output is organised and ready to process/use. The logic of the query is explained below.
| Query | Purpose |
|---|---|
tshark -r hostnames.pcapng -T fields -e dhcp.option.hostname |
Main query. Extract the DHCP hostname value. |
awk NF |
Remove empty lines. |
sort -r |
Sort recursively before handling the values. |
uniq -c |
Show unique values, but calculate and show the number of occurrences. |
sort -r |
The final sort process. Show the output/results from high occurrences to less. |
Extract DNS Queries
Matches filter
user@ubuntu$ tshark -r dns-queries.pcap -T fields -e dns.qry.name | awk NF | sort -r | uniq -c | sort -r
96 connectivity-check.ubuntu.com.rhodes.edu
94 connectivity-check.ubuntu.com
8 3.57.20.10.in-addr.arpa
4 e.9.d.b.c.9.d.7.1.b.0.f.a.2.0.2.0.0.0.0.0.0.0.0.0.0.0.0.0.8.e.f.ip6.arpa
4 0.f.2.5.6.b.e.f.f.f.b.7.2.4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.e.f.ip6.arpa
2 _ipps._tcp.local,_ipp._tcp.local
2 84.170.224.35.in-addr.arpa
2 22.2.10.10.in-addr.arpa
Extract User Agents
Matches filter
user@ubuntu$ tshark -r user-agents.pcap -T fields -e http.user_agent | awk NF | sort -r | uniq -c | sort -r
6 Mozilla/5.0 (Windows; U; Windows NT 6.4; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.237 Safari/534.10
5 Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0
5 Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.32 Safari/537.36
4 sqlmap/1.4#stable (http://sqlmap.org)
3 Wfuzz/2.7
3 Mozilla/5.0 (compatible; Nmap Scripting Engine; https://nmap.org/book/nse.html)
Extract Email Addresses with RegEx:
This command reads the packet capture file teamwork.pcap and displays detailed information about each packet. The -r option specifies the file to read from, and the -V option provides a verbose output, showing all the details of each packet. We then pipe this output to grep, which uses a regular expression to search for and extract email addresses from the detailed packet information.
tshark -r teamwork.pcap -V | grep -Eo '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
Extract Domain or Server info:
tshark -r directory-curiosity.pcap -T fields -e http.server | awk NF | sort -r | uniq -c | sort -r